A Model of Neuronal Specialization Using Hebbian Policy-Gradient with "Slow" Noise
Abstract
We study a model of neuronal specialization using a policy-gradient reinforcement learning approach. (1) The neurons fire stochastically according to their synaptic input plus a noise term; (2) the environment is a closed-loop system composed of a rotating eye and a point-like visual target; (3) the network is composed of a foveated retina, a primary layer and a motoneuron layer; (4) the reward depends on the distance between the subjective target position and the fovea; and (5) the weight update depends on a Hebbian trace defined according to a policy-gradient principle. To account for the mismatch between neuronal and environmental integration times, we distort the firing probability with a “pink noise” term whose autocorrelation time is of the order of 100 ms, so that the firing probability is overestimated (or underestimated) for periods of about 100 ms. The rewards occurring during these periods assess the “value” of those elementary shifts and modify the firing probability accordingly. Since every motoneuron is associated with a particular angular direction, we test the preferred output of the visual cells at the end of the learning process. We find that, in accordance with the observed final behavior, the visual cells preferentially excite the motoneurons heading in the opposite angular direction.
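The abstract gives no equations, so the following is only a minimal sketch of the two ingredients it names: temporally correlated noise on the firing probability and a reward-modulated Hebbian trace. Everything concrete here is an assumption made for illustration: the function names `slow_noise` and `step`, the logistic firing probability, the learning rate and trace decay, and the use of a first-order autoregressive (discretized Ornstein-Uhlenbeck) process with a 100 ms time constant as a simple stand-in for the paper's “pink noise”.

```python
import math
import random


def slow_noise(n_steps, dt_ms=1.0, tau_ms=100.0, sigma=1.0, rng=None):
    """AR(1) noise with unit-lag correlation exp(-dt/tau) and stationary std sigma.

    Assumption: this replaces the paper's "pink noise"; it only shares the
    property of being correlated over roughly tau_ms milliseconds.
    """
    rng = rng or random.Random(0)
    a = math.exp(-dt_ms / tau_ms)
    scale = sigma * math.sqrt(1.0 - a * a)
    x, out = 0.0, []
    for _ in range(n_steps):
        x = a * x + scale * rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def step(weights, inputs, noise, trace, reward, rng, lr=0.01, trace_decay=0.9):
    """One time step of a single stochastic neuron (hypothetical update rule)."""
    drive = sum(w * x for w, x in zip(weights, inputs))
    p_clean = sigmoid(drive)          # firing probability without noise
    p_noisy = sigmoid(drive + noise)  # probability distorted by the slow noise
    spike = 1 if rng.random() < p_noisy else 0
    # Hebbian trace: presynaptic input times (spike minus expected spike),
    # the usual policy-gradient term for a Bernoulli-logistic unit.
    new_trace = [trace_decay * e + (spike - p_clean) * x
                 for e, x in zip(trace, inputs)]
    # The reward gates the weight change along the trace.
    new_weights = [w + lr * reward * e for w, e in zip(weights, new_trace)]
    return new_weights, new_trace, spike
```

Because the noise stays on one side of zero for stretches of roughly 100 ms, the spikes emitted during such a stretch are biased in a consistent direction, and a reward arriving in that window reinforces or weakens that bias through the trace. A typical use would draw `noise = slow_noise(T)` once and call `step` with `noise[t]` at each time step.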